Improve inferrability of getproperty(::Row2, ::Symbol) #753
Fixes #752. This is a case of us not providing quite enough information to the compiler, along with the compiler itself being too clever. The default for `CSV.Rows` is to treat each column as `Union{String, Missing}`, which results in the `V` type parameter of `CSV.Rows` being `CSV.PosLen`, instead of `Any`. If that's the case, we should get pretty good inferrability for `getproperty(::Row2, ::Symbol)`, because we should be able to know the return value will at least be `Union{String, Missing}`. This knowledge, however, was trapped in the "csv domain" and not expressed clearly enough to the compiler. It inspected `Tables.getcolumn(::Row2, nm::Symbol)` and saw that it called `Tables.getcolumn(::Row2, i::Int)`, which in turn called `Tables.getcolumn(::Row2, T, i, nm)`. This is all fine and expected, except that when we started supporting non-String types for `CSV.Rows` (i.e. you can pass in whatever type you want and we'll parse it directly from the file for each row), we added an additional typed `Tables.getcolumn` method that handled all the non-String columns. Oops.

Now the compiler is confused because from `Tables.getcolumn(::Row2, nm::Symbol)` it knows that it can return `missing`, a `String`, or, if we call this third method, an instance of our `V` type parameter, which, if you'll remember, in the default case is `CSV.PosLen`, or more simply, `UInt64`. So we ended up with a return type of `Union{Missing, UInt64, String}`, which makes downstream operations even trickier to figure out.

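
To make the widening concrete, here is a toy sketch of the problem. It is not CSV.jl's actual code: `ToyRow`, `decode`, and `getcol_generic` are hypothetical stand-ins, and the only assumption carried over from the description is that the default value type is a `UInt64`. Because one accessor covers both the String path and the typed path, inference has to account for both branches even when `V === UInt64`.

```julia
# Hypothetical stand-in for CSV.Row2; V plays the role of the value type parameter.
struct ToyRow{V}
    vals::Vector{V}
    types::Vector{Type}  # per-column types, only known at run time
end

# Hypothetical decoder: an encoded cell becomes a String, or missing when it is 0.
decode(x::UInt64) = x == 0 ? missing : string(x)

# One generic accessor that handles both the String path and the typed path.
function getcol_generic(r::ToyRow, i::Int)
    if r.types[i] === String
        return decode(r.vals[i])  # Union{String, Missing}
    else
        return r.vals[i]          # an instance of V
    end
end

row = ToyRow{UInt64}(UInt64[0, 42], Type[String, String])

# Ask inference what it can prove about the return type when V === UInt64.
inferred = Base.return_types(getcol_generic, (ToyRow{UInt64}, Int))
println(inferred)
```

In this sketch the printed type should be wider than `Union{String, Missing}`, since the compiler cannot rule out the branch that returns a raw `V`.
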
Luckily, the solution here is to just help connect the dots for the compiler: i.e. define specialized methods that dispatch on `V`, specifically when `V === UInt64`. Then the compiler will see/know that we will only ever call the `Union{String, Missing}` method and can ignore the custom types codepath. This PR also rearranges a few `@inbounds` uses since we can avoid the bounds checks further down the stack once we've checked them higher up.
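
The dispatch idea can be sketched on a toy type. Again, this is an illustration under assumptions rather than the code in this PR: `ToyRow`, `decode`, and `getcol` are hypothetical names, and the fallback method is deliberately simplified. The point is only that a method specialized on `ToyRow{UInt64}` lets inference see a single codepath.

```julia
# Hypothetical stand-in for CSV.Row2; V plays the role of the value type parameter.
struct ToyRow{V}
    vals::Vector{V}
    types::Vector{Type}  # per-column types, only known at run time
end

# Hypothetical decoder: an encoded cell becomes a String, or missing when it is 0.
decode(x::UInt64) = x == 0 ? missing : string(x)

# Specialized method: when V === UInt64 every column takes the String path,
# so the return type is limited to Union{String, Missing}.
getcol(r::ToyRow{UInt64}, i::Int) = decode(r.vals[i])

# Fallback for custom value types: return the stored value as-is.
getcol(r::ToyRow{V}, i::Int) where {V} = r.vals[i]

strrow = ToyRow{UInt64}(UInt64[0, 42], Type[String, String])
introw = ToyRow{Int}([1, 2], Type[Int, Int])

# Inference now sees one codepath per value type.
inferred_str = Base.return_types(getcol, (ToyRow{UInt64}, Int))
inferred_int = Base.return_types(getcol, (ToyRow{Int}, Int))
println(inferred_str)
println(inferred_int)
```

Because `ToyRow{UInt64}` is more specific than `ToyRow{V} where V`, dispatch picks the specialized method for default rows and the custom-type fallback never enters the inferred return type.
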